Variable selection for sparse Dirichlet-multinomial regression with an application to microbiome data analysis


Abstract

With the development of next generation sequencing technology, researchers have now been able to study the microbiome composition using direct sequencing, whose output are bacterial taxa counts for each microbiome sample. One goal of microbiome study is to associate the microbiome composition with environmental covariates. We propose to model the taxa counts using a Dirichlet-multinomial (DM) regression model in order to account for overdispersion of observed counts. The DM regression model can be used for testing the association between taxa composition and covariates using the likelihood ratio test. However, when the number of covariates is large, multiple testing can lead to loss of power. To address the high dimensionality of the problem, we develop a penalized likelihood approach to estimate the regression parameters and to select the variables by imposing a sparse group $\ell_1$ penalty to encourage both group-level and within-group sparsity. Such a variable selection procedure can lead to selection of the relevant covariates and their associated bacterial taxa. An efficient block-coordinate descent algorithm is developed to solve the optimization problem. We present extensive simulations to demonstrate that the sparse DM regression can result in better identification of the microbiome-associated covariates than models that ignore overdispersion or only consider the proportions. We demonstrate the power of our method in an analysis of a data set evaluating the effects of nutrient intake on human gut microbiome composition. Our results have clearly shown that the nutrient intake is strongly associated with the human gut microbiome.
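To make the penalized-likelihood idea in the abstract concrete, the following is a minimal Python sketch of a Dirichlet-multinomial log-likelihood combined with a sparse group penalty. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the link function or the exact penalty form, so the log link `gamma_k = exp(alpha_k + x . beta_k)`, the grouping of coefficients by covariate, the two tuning parameters `lam_group` and `lam_l1`, and all function names are assumptions made for this example. The block-coordinate descent solver mentioned in the abstract is not reproduced here; only the objective it would minimize is shown.

```python
import math


def dm_log_likelihood(counts, gammas):
    """Log-probability of one sample's taxa counts under a
    Dirichlet-multinomial with positive parameters `gammas`."""
    n = sum(counts)
    g_sum = sum(gammas)
    ll = math.lgamma(n + 1) + math.lgamma(g_sum) - math.lgamma(n + g_sum)
    for y, g in zip(counts, gammas):
        ll += math.lgamma(y + g) - math.lgamma(g) - math.lgamma(y + 1)
    return ll


def dm_gammas(x, alpha, beta):
    """Assumed log link: gamma_k = exp(alpha_k + sum_j x_j * beta[j][k]).
    beta[j][k] is the effect of covariate j on taxon k."""
    return [
        math.exp(alpha[k] + sum(x[j] * beta[j][k] for j in range(len(x))))
        for k in range(len(alpha))
    ]


def sparse_group_penalty(beta, lam_group, lam_l1):
    """Assumed sparse group penalty: each covariate's coefficients across
    taxa form one group (L2 norm), plus an elementwise L1 term."""
    penalty = 0.0
    for group in beta:
        penalty += lam_group * math.sqrt(sum(b * b for b in group))
        penalty += lam_l1 * sum(abs(b) for b in group)
    return penalty


def penalized_objective(X, Y, alpha, beta, lam_group, lam_l1):
    """Negative DM log-likelihood over all samples plus the penalty."""
    nll = -sum(
        dm_log_likelihood(y, dm_gammas(x, alpha, beta)) for x, y in zip(X, Y)
    )
    return nll + sparse_group_penalty(beta, lam_group, lam_l1)
```

In this sketch the group term can zero out an entire covariate (group-level sparsity), while the L1 term can zero out individual covariate-taxon effects within a retained group (within-group sparsity), which mirrors the two kinds of sparsity the abstract describes.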

Bibliographic record

  • Authors

    Chen, Jun; Li, Hongzhe

  • Author affiliation
  • Year 2013
  • Total pages
  • Original format PDF
  • Text language English
  • Chinese Library Classification
